Selection and Information: A Class-Based Approach to Lexical Relationships

Author

  • Philip Stuart Resnik
Abstract

Selectional constraints are limitations on the applicability of predicates to arguments. For example, the statement “The number two is blue” may be syntactically well formed, but at some level it is anomalous: BLUE is not a predicate that can be applied to numbers. According to the influential theory of Katz and Fodor (1964), a predicate associates a set of defining features with each argument, expressed within a restricted semantic vocabulary. Despite the persistence of this theory, however, there is widespread agreement about its empirical shortcomings (McCawley, 1968; Fodor, 1977). As an alternative, some critics of the Katz-Fodor theory (e.g. Johnson-Laird, 1983) have abandoned the treatment of selectional constraints as semantic, instead treating them as indistinguishable from inferences made on the basis of factual knowledge. This provides a better match for the empirical phenomena, but it opens up a different problem: if selectional constraints are the same as inferences in general, then accounting for them will require a much more complete understanding of knowledge representation and inference than we have at present. The problem, then, is this: how can a theory of selectional constraints be elaborated without first having either an empirically adequate theory of defining features or a comprehensive theory of inference? In this dissertation, I suggest that an answer to this question lies in the representation of conceptual knowledge. Following Miller (1990b), I adopt a “differential” approach to conceptual representation, in which a conceptual taxonomy is defined in terms of inferential relationships rather than definitional features. Crucially, however, the inferences underlying the stored knowledge are not made explicit. My hypothesis is that a theory of selectional constraints need make reference only to knowledge stored in such a taxonomy, without ever referring overtly to inferential processes.
I propose such a theory, formalizing selectional relationships in probabilistic terms: the selectional behavior of a predicate is modeled as its distributional effect on the conceptual classes of its arguments. This is expressed using the information-theoretic measure of relative entropy (Kullback and Leibler, 1951), which leads to an illuminating interpretation of what selectional constraints are: the strength of a predicate’s selection for an argument is identified with the quantity of information it carries about that argument. In addition to arguing that the model is empirically adequate, I explore its application to two problems. The first concerns a linguistic question: why some transitive verbs permit implicit direct objects (“John ate Ø”) and others do not (“*John brought Ø”). It has often been observed informally that the omission of objects is connected to the ease with which the object can be inferred. I have made this observation more formal by positing a relationship between selectional constraints and inferability. This predicts (i) that verbs permitting implicit objects select more strongly for (i.e. carry more information about) that argument than verbs that do not, and (ii) that strength of selection is a predictor of how often verbs omit their objects in naturally occurring utterances. Computational experiments confirm these predictions. Second, I have explored the practical applications of the model in resolving syntactic ambiguity. A number of authors have recently begun investigating the use of corpus-based lexical statistics in automatic parsing; the results of computational experiments using the present model suggest that many lexical relationships are better viewed in terms of underlying conceptual relationships. 
Thus the information-theoretic measures proposed here can serve not only as components in a theory of selectional constraints, but also as tools for practical natural language processing.

Disciplines: Cognitive Neuroscience

Comments: University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-93-42.

This thesis or dissertation is available at ScholarlyCommons: http://repository.upenn.edu/ircs_reports/200
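As a rough illustration of the relative-entropy measure described in the abstract, the sketch below computes the strength of a predicate's selection as the relative entropy between the distribution of argument classes given the predicate and the prior distribution of argument classes. The class names, the probabilities, and the function name `selectional_strength` are hypothetical and chosen only for illustration; they are not taken from the dissertation.

```python
import math


def selectional_strength(prior, posterior):
    """Relative entropy D(posterior || prior), in bits.

    prior:     dict mapping class -> P(class)
    posterior: dict mapping class -> P(class | predicate)
    Both are assumed to be proper probability distributions.
    """
    total = 0.0
    for cls, p_post in posterior.items():
        if p_post == 0.0:
            # Terms with zero posterior probability contribute nothing.
            continue
        p_prior = prior.get(cls, 0.0)
        if p_prior == 0.0:
            raise ValueError(f"class {cls!r} has posterior mass but zero prior")
        total += p_post * math.log2(p_post / p_prior)
    return total


# Toy, made-up distributions over four hypothetical conceptual classes.
prior = {"food": 0.25, "artifact": 0.25, "person": 0.25, "abstraction": 0.25}

# A predicate whose argument almost always falls in one class (strong selection).
strong = {"food": 1.0, "artifact": 0.0, "person": 0.0, "abstraction": 0.0}

# A predicate that leaves the class distribution unchanged (no selection).
weak = dict(prior)

strong_bits = selectional_strength(prior, strong)
weak_bits = selectional_strength(prior, weak)
print(strong_bits, weak_bits)
```

With these toy numbers, the predicate that concentrates its arguments in a single class carries 2 bits of information about the argument, while the predicate that leaves the class distribution unchanged carries 0 bits. This matches the reading in the abstract that stronger selection means more information about the argument.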


Similar resources

Exploring Impacts of Consciousness-raising in a Genre-based Pedagogy

This study reports on the findings of a genre teaching course for developing the academic writing of a class of EFL students in Iran. The information report genre was taught through a cyclical process of teaching and learning, which started with ‘setting the context’ and ‘deconstruction’ of a prototype information report genre, and continued with ‘joint construction’, ‘independent construction’, and final...


Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem

Application of data mining methods as a decision support system has a great benefit to predict survival of new patients. It also has a great potential for health researchers to investigate the relationship between risk factors and cancer survival. But due to the imbalanced nature of datasets associated with breast cancer survival, the accuracy of survival prognosis models is a challenging issue...


Automatic Detection of Non-deverbal Event Nouns for Quick Lexicon Production

In this work we present the results of experimental work on the development of lexical class-based lexica by automatic means. Our purpose is to assess the use of linguistic lexical-class based information as a feature selection methodology for the use of classifiers in quick lexical development. The results show that the approach can help reduce the human effort required in the development of l...


An artificial intelligence model based on LS-SVM for third-party logistics provider selection

The use of third-party logistics (3PL) providers is regarded as a new strategy in logistics management. Relationships involving 3PL providers are sometimes more complicated than classical logistics supplier relationships. These relationships have been taken into account as a well-known way to highlight organizations' flexibility to respond rapidly to uncertain market conditions, follow core competenc...


A New Approach to Project Risk Responses Selection with Inter-dependent Risks

Risks are natural and inherent characteristics of major projects. Risks are usually considered independently in the analysis of risk responses. However, most risks depend on each other, and independent risks are rare in the real world. This paper proposes a model for proper risk response selection from the responses portfolio with the purpose of optimization of defined criteria for projects. Th...


WordNet and Distributional Analysis: A Class-based Approach to Lexical Discovery

It has become common in statistical studies of natural language data to use measures of lexical association, such as the information-theoretic measure of mutual information, to extract useful relationships between words (e.g. [Church et al., 1989; Church and Hanks, 1989; Hindle, 1990]). For example, [Hindle, 1990] uses an estimate of mutual information to calculate what nouns a verb can take as...



Publication date: 1993